Search | Global Index Medicus

Bioinformatics services for analyzing massive genomic datasets

Gunhwan KO; Pan-Gyu KIM; Youngbum CHO; Seongmun JEONG; Jae-Yoon KIM; Kyoung-Hyoun KIM; Ho-Yeon LEE; Jiyeon HAN; Namhee YU; Seokjin HAM; Insoon JANG; Byunghee KANG; Sunguk SHIN; Lian KIM; Seung-Won LEE; Dougu NAM; Jihyun-F. KIM; Namshin KIM; Seon-Young KIM; Sanghyuk LEE; Tae-Young ROH; Byungwook LEE.

Genomics & Informatics ; : e8-2020.

Article in English | WPRIM | ID: wpr-898396

ABSTRACT

The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution to this computational challenge is cloud computing, in which CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create new pipelines for analyzing their own omics data. We also developed several web-based services to facilitate downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.

Bioinformatics services for analyzing massive genomic datasets

Genomics & Informatics ; : e8-2020.

Article in English | WPRIM | ID: wpr-890692

Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS

Ji-Sun KWON; Jihye KIM; Dougu NAM; Sangsoo KIM.

Genomics & Informatics ; : 123-127, 2012.

Article in English | WPRIM | ID: wpr-57571

ABSTRACT

Gene set analysis (GSA) is useful for interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two GSA implementations, GSA-SNP and i-GSEA4GWAS, which accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. GSA-SNP, on the other hand, reported 94 and 38 hits for the unimputed and imputed datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets as well. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact of the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be because it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of GSA outcomes, perhaps based on multiple programs, is recommended.

Subject(s)

Artifacts; Diabetes Mellitus, Type 2; Genome; Genome-Wide Association Study; Genotype; Hand; Polymorphism, Single Nucleotide
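
The abstract above notes that a Z-statistic is sensitive to the background level of association. The sketch below is a generic illustration of that point only, not GSA-SNP's actual implementation: it assumes each gene is scored as -log10(p), and that a gene set's Z is the difference between the set mean and the all-gene mean, divided by the standard error. The function name, gene names, and p-values are all hypothetical.

```python
import math
from statistics import mean, pstdev


def gene_set_z(gene_scores, gene_set):
    """Z-statistic of a gene set's mean score against all scored genes.

    Illustrative formula (an assumption, not taken from the abstract):
    Z = (mean_set - mean_all) / (sd_all / sqrt(n_set))
    """
    members = [gene_scores[g] for g in gene_set if g in gene_scores]
    if not members:
        raise ValueError("no scored genes in the set")
    background = list(gene_scores.values())
    mu = mean(background)
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (mean(members) - mu) / (sigma / math.sqrt(len(members)))


def to_scores(p_values):
    """Convert gene-level p-values to -log10(p) scores."""
    return {gene: -math.log10(p) for gene, p in p_values.items()}


# Hypothetical gene-level p-values; G1 and G2 form the gene set of interest.
p_values = {"G1": 1e-6, "G2": 1e-4, "G3": 0.2, "G4": 0.5, "G5": 0.8, "G6": 0.05}
gene_set = {"G1", "G2"}
z_original = gene_set_z(to_scores(p_values), gene_set)

# Same gene set, but with stronger association among the background genes.
inflated = dict(p_values, G3=1e-4, G4=1e-4, G5=1e-4, G6=1e-4)
z_inflated = gene_set_z(to_scores(inflated), gene_set)

print(f"Z with original background: {z_original:.2f}")
print(f"Z with inflated background: {z_inflated:.2f}")
```

In this toy example the scores of the gene set itself never change, yet its Z drops once the background genes become more strongly associated, which is the kind of sensitivity the abstract describes.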
